Results 1 - 19 of 19
1.
NPJ Precis Oncol ; 8(1): 5, 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38184744

ABSTRACT

Drug sensitivity prediction models can aid in personalising cancer therapy, biomarker discovery, and drug design. Such models require survival data from randomised controlled trials which can be time consuming and expensive. In this proof-of-concept study, we demonstrate for the first time that deep learning can link histological patterns in whole slide images (WSIs) of Haematoxylin & Eosin (H&E) stained breast cancer sections with drug sensitivities inferred from cell lines. We employ patient-wise drug sensitivities imputed from gene expression-based mapping of drug effects on cancer cell lines to train a deep learning model that predicts patients' sensitivity to multiple drugs from WSIs. We show that it is possible to use routine WSIs to predict the drug sensitivity profile of a cancer patient for a number of approved and experimental drugs. We also show that the proposed approach can identify cellular and histological patterns associated with drug sensitivity profiles of cancer patients.

2.
Cell Rep Med ; 4(12): 101313, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38118424

ABSTRACT

Identification of the gene expression state of a cancer patient from routine pathology imaging and characterization of its phenotypic effects have significant clinical and therapeutic implications. However, prediction of expression of individual genes from whole slide images (WSIs) is challenging due to co-dependent or correlated expression of multiple genes. Here, we use a purely data-driven approach to first identify groups of genes with co-dependent expression and then predict their status from WSIs using a bespoke graph neural network. These gene groups allow us to capture the gene expression state of a patient with a small number of binary variables that are biologically meaningful and carry histopathological insights for clinical and therapeutic use cases. Prediction of gene expression state based on these gene groups allows associating histological phenotypes (cellular composition, mitotic counts, grading, etc.) with underlying gene expression patterns and opens avenues for gaining biological insights from routine pathology imaging directly.


Subjects
Breast Neoplasms , Gene Expression Profiling , Humans , Female , Transcriptome/genetics , Neural Networks, Computer , Phenotype , Breast Neoplasms/genetics
3.
Front Bioinform ; 2: 1083292, 2022.
Article in English | MEDLINE | ID: mdl-36591335

ABSTRACT

As practitioners of machine learning in the area of bioinformatics, we know that the quality of the results crucially depends on the quality of our labeled data. While there is a tendency to focus on the quality of positive examples, the negative examples are equally important. In this opinion paper we revisit the problem of choosing negative examples for the task of predicting protein-protein interactions, either among proteins of a given species or for host-pathogen interactions, and describe important issues that are prevalent in the current literature. The challenge in creating datasets for this task is the noisy nature of the experimentally derived interactions and the lack of information on non-interacting proteins. A standard approach is to choose random pairs of non-interacting proteins as negative examples. Since the interactomes of all species are only partially known, this leads to a very small percentage of false negatives. This is especially true for host-pathogen interactions. To address this perceived issue, some researchers have chosen to select negative examples as pairs of proteins whose sequence similarity to the positive examples is sufficiently low. This clearly reduces the chance of false negatives, but also makes the problem much easier than it really is, leading to over-optimistic accuracy estimates. We demonstrate the effect of this form of bias using a selection of recent protein interaction prediction methods of varying complexity, and urge researchers to pay attention to the details of generating their datasets for potential biases like this.
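
The "standard approach" of random negative sampling mentioned in this abstract can be illustrated with a minimal sketch. The function name `sample_random_negatives`, the toy protein identifiers, and the choice to enumerate all candidate pairs are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations
import random


def sample_random_negatives(proteins, positives, n, seed=0):
    """Draw n random protein pairs that are not known positive interactions.

    Hypothetical helper: pairs are unordered, and any pair absent from
    `positives` is treated as a negative, which may include a small
    fraction of undiscovered true interactions (false negatives).
    """
    known = {frozenset(pair) for pair in positives}
    # Enumerate every unordered pair that is not a known interaction.
    candidates = [
        pair
        for pair in combinations(sorted(set(proteins)), 2)
        if frozenset(pair) not in known
    ]
    if n > len(candidates):
        raise ValueError("not enough non-interacting pairs to sample from")
    return random.Random(seed).sample(candidates, n)


proteins = ["A", "B", "C", "D"]
positives = [("A", "B"), ("C", "D")]
negatives = sample_random_negatives(proteins, positives, n=3)
```

Enumerating all pairs is only practical for small protein sets; for proteome-scale data a rejection-sampling loop would likely be preferable.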

4.
IEEE/ACM Trans Comput Biol Bioinform ; 18(3): 1142-1150, 2021.
Article in English | MEDLINE | ID: mdl-31443048

ABSTRACT

Amyloid proteins are implicated in several diseases such as Parkinson's, Alzheimer's, and prion diseases. In order to characterize the amyloidogenicity of a given protein, it is important to locate the amyloid-forming hotspot regions within the protein as well as to analyze the effects of mutations on these proteins. The biochemical and biological assays used for this purpose can be facilitated by computational means. This paper presents a machine learning method that can predict hotspot amyloidogenic regions within proteins and characterize changes in their amyloidogenicity due to point mutations. The proposed method, called MILAMP (Multiple Instance Learning of AMyloid Proteins), achieves high accuracy for identification of amyloid proteins, hotspot localization, and prediction of mutation effects on amyloidogenicity by integrating heterogeneous data sources and exploiting common predictive patterns across these tasks through multiple instance learning. The paper presents comprehensive benchmarking experiments to test the predictive performance of MILAMP in comparison to previously published state-of-the-art techniques for amyloid prediction. The Python code for the implementation and the webserver for MILAMP are available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#MILAMP.


Subjects
Amyloidogenic Proteins , Computational Biology/methods , Machine Learning , Amyloidogenic Proteins/chemistry , Amyloidogenic Proteins/genetics , Amyloidogenic Proteins/metabolism , Databases, Protein , Humans , Sequence Analysis, Protein
5.
Nucleic Acids Res ; 49(D1): D622-D629, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33068435

ABSTRACT

CRISPR-Cas is an anti-viral mechanism of prokaryotes that has been widely adopted for genome editing. To make CRISPR-Cas genome editing more controllable and safer to use, anti-CRISPR proteins have been recently exploited to prevent excessive/prolonged Cas nuclease cleavage. Anti-CRISPR (Acr) proteins are encoded by (pro)phages/(pro)viruses, and have the ability to inhibit their host's CRISPR-Cas systems. We have built an online database AcrDB (http://bcb.unl.edu/AcrDB) by scanning ∼19 000 genomes of prokaryotes and viruses with AcrFinder, a recently developed Acr-Aca (Acr-associated regulator) operon prediction program. Proteins in Acr-Aca operons were further processed by two machine learning-based programs (AcRanker and PaCRISPR) to obtain numerical scores/ranks. Compared to other anti-CRISPR databases, AcrDB has the following unique features: (i) It is a genome-scale database with the largest collection of data (39 799 Acr-Aca operons containing Aca or Acr homologs); (ii) It offers a user-friendly web interface with various functions for browsing, graphically viewing, searching, and batch downloading Acr-Aca operons; (iii) It focuses on the genomic context of Acr and Aca candidates instead of individual Acr protein family and (iv) It collects data with three independent programs each having a unique data mining algorithm for cross validation. AcrDB will be a valuable resource to the anti-CRISPR research community.


Subjects
CRISPR-Cas Systems/genetics , Databases, Genetic , Operon/genetics , Prokaryotic Cells/metabolism , Viruses/metabolism , Internet
6.
BioData Min ; 13(1): 20, 2020 Nov 25.
Article in English | MEDLINE | ID: mdl-33292419

ABSTRACT

BACKGROUND: Determining binding affinity in protein-protein interactions is important in the discovery and design of novel therapeutics and in mutagenesis studies. Determination of the binding affinity of proteins in the formation of protein complexes requires sophisticated, expensive and time-consuming experimentation, which can be replaced with computational methods. Most computational prediction techniques require protein structures, which limits their applicability to protein complexes with known structures. In this work, we explore sequence-based protein binding affinity prediction using machine learning. METHOD: We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the protein binding affinity. RESULTS: We present our findings that the true generalization performance of even the state-of-the-art sequence-only predictor is far from satisfactory and that the development of machine learning methods for binding affinity prediction with improved generalization performance is still an open problem. We have also proposed a novel sequence-based protein binding affinity predictor called ISLAND, which gives better accuracy than existing methods over the same validation set as well as on an external independent test dataset. A cloud-based webserver implementation of ISLAND and its Python code are available at https://sites.google.com/view/wajidarshad/software. CONCLUSION: This paper highlights the fact that the true generalization performance of even the state-of-the-art sequence-only predictor of binding affinity is far from satisfactory and that the development of effective and practical methods in this domain is still an open problem.

7.
Nucleic Acids Res ; 48(9): 4698-4708, 2020 May 21.
Article in English | MEDLINE | ID: mdl-32286628

ABSTRACT

The increasing use of CRISPR-Cas9 in medicine, agriculture, and synthetic biology has accelerated the drive to discover new CRISPR-Cas inhibitors as potential mechanisms of control for gene editing applications. Many anti-CRISPRs have been found that inhibit the CRISPR-Cas adaptive immune system. However, comparing all currently known anti-CRISPRs does not reveal a shared set of properties for facile bioinformatic identification of new anti-CRISPR families. Here, we describe AcRanker, a machine learning based method to aid direct identification of new potential anti-CRISPRs using only protein sequence information. Using a training set of known anti-CRISPRs, we built a model based on XGBoost ranking. We then applied AcRanker to predict candidate anti-CRISPRs from predicted prophage regions within self-targeting bacterial genomes and discovered two previously unknown anti-CRISPRs: AcrIIA20 (ML1) and AcrIIA21 (ML8). We show that AcrIIA20 strongly inhibits Streptococcus iniae Cas9 (SinCas9) and weakly inhibits Streptococcus pyogenes Cas9 (SpyCas9). We also show that AcrIIA21 inhibits SpyCas9, Staphylococcus aureus Cas9 (SauCas9) and SinCas9 with low potency. The addition of AcRanker to the anti-CRISPR discovery toolkit allows researchers to directly rank potential anti-CRISPR candidate genes for increased speed in testing and validation of new anti-CRISPRs. A web server implementation for AcRanker is available online at http://acranker.pythonanywhere.com/.


Subjects
Bacterial Proteins/genetics , CRISPR-Associated Protein 9/antagonists & inhibitors , Machine Learning , Bacterial Proteins/chemistry , Prophages/genetics , Proteome , Sequence Analysis, Protein , Streptococcus/enzymology , Streptococcus/genetics
8.
PLoS One ; 14(12): e0225876, 2019.
Article in English | MEDLINE | ID: mdl-31794580

ABSTRACT

Begomoviruses interfere with host plant machinery to evade host defense mechanisms by interacting with plant proteins. In the Old World, this group of viruses is usually associated with a betasatellite that induces severe disease symptoms by encoding a protein, βC1, which is a pathogenicity determinant. Here, we show that βC1 encoded by Cotton leaf curl Multan betasatellite (CLCuMB) requires Gossypium hirsutum calmodulin-like protein 11 (Gh-CML11) to infect cotton. First, we used an in silico approach to predict the interaction of CLCuMB-βC1 with Gh-CML11. A number of sequence- and structure-based in silico interaction prediction techniques suggested a strong putative binding of CLCuMB-βC1 with Gh-CML11 in a Ca2+-dependent manner. The in silico interaction prediction was then confirmed by three different experimental approaches: the Gh-CML11 interaction with CLCuMB-βC1 was confirmed in a yeast two-hybrid system and a pull-down assay, and these results were further validated using a bimolecular fluorescence complementation system showing the interaction in cytoplasmic veins of Nicotiana benthamiana. Bioinformatics and molecular studies suggested that CLCuMB-βC1 induces the overexpression of Gh-CML11 protein and ultimately provides calcium as a nutrient source for virus movement and transmission. This is the first comprehensive study on the interaction between CLCuMB-βC1 and Gh-CML11 proteins, which provided insights into our understanding of the role of βC1 in cotton leaf curl disease.


Subjects
Begomovirus/metabolism , Calmodulin , Gossypium , Plant Diseases , Plant Proteins , Calmodulin/genetics , Calmodulin/metabolism , Gossypium/genetics , Gossypium/metabolism , Gossypium/virology , Plant Diseases/genetics , Plant Diseases/virology , Plant Proteins/genetics , Plant Proteins/metabolism , Nicotiana/genetics , Nicotiana/metabolism , Nicotiana/virology
9.
Front Plant Sci ; 10: 656, 2019.
Article in English | MEDLINE | ID: mdl-31191577

ABSTRACT

Cotton leaf curl disease (CLCuD), caused by viruses of the genus Begomovirus, is a major constraint to cotton (Gossypium hirsutum) production in many cotton-growing regions of the world. Symptoms of the disease are caused by Cotton leaf curl Multan betasatellite (CLCuMB), which encodes a pathogenicity determinant protein, βC1. Here, we report the identification of interacting regions in the βC1 protein using computational approaches including sequence recognition, and binding site and interface prediction methods. We show the domain-level interactions based on the structural analysis of G. hirsutum SnRK1 protein and its domains with CLCuMB-βC1. To verify and validate the in silico predictions, three different experimental approaches were used: yeast two-hybrid, bimolecular fluorescence complementation and pull-down assay. Our results showed that the ubiquitin-associated (UBA) and autoinhibitory sequence (AIS) domains of G. hirsutum-encoded SnRK1 are involved in the CLCuMB-βC1 interaction. This is the first comprehensive investigation that combined in silico interaction prediction followed by experimental validation of the interaction between CLCuMB-βC1 and a host protein. We demonstrated that data from computational biology could provide binding site information between CLCuD-associated viruses/satellites and new hosts that lack known binding site information for protein-protein interaction studies. Implications of these findings are discussed.

10.
BMC Bioinformatics ; 19(1): 425, 2018 Nov 15.
Article in English | MEDLINE | ID: mdl-30442086

ABSTRACT

BACKGROUND: Determining protein-protein interactions and their binding affinity are important in understanding cellular biological processes, discovery and design of novel therapeutics, protein engineering, and mutagenesis studies. Due to the time and effort required in wet lab experiments, computational prediction of binding affinity from sequence or structure is an important area of research. Structure-based methods, though more accurate than sequence-based techniques, are limited in their applicability due to limited availability of protein structure data. RESULTS: In this study, we propose a novel machine learning method for predicting binding affinity that uses protein 3D structure as privileged information at training time while expecting only protein sequence information during testing. Using the method, which is based on the framework of learning using privileged information (LUPI), we have achieved improved performance over corresponding sequence-based binding affinity prediction methods that do not have access to privileged information during training. Our experiments show that with the proposed framework which uses structure only during training, it is possible to achieve classification performance comparable to that which is obtained using structure-based features. Evaluation on an independent test set shows improved performance over the PPA-Pred2 method as well. CONCLUSIONS: The proposed method outperforms several baseline learners and a state-of-the-art binding affinity predictor not only in cross-validation, but also on an additional validation dataset, demonstrating the utility of the LUPI framework for problems that would benefit from classification using structure-based features. The implementation of LUPI developed for this work is expected to be useful in other areas of bioinformatics as well.


Subjects
Algorithms , Computational Biology/methods , Machine Learning , Proteins/metabolism , Amino Acid Sequence , Ligands , Protein Binding , Proteins/chemistry , ROC Curve , Reproducibility of Results , Support Vector Machine
11.
J Bioinform Comput Biol ; 16(4): 1850014, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30060698

ABSTRACT

Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology. Particularly, pathogenic infections are caused by interactions of host and pathogen proteins. It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases. Conventional wet lab PPI detection techniques have limitations in terms of cost and large-scale application. Hence, computational approaches are developed to predict PPIs. This study aims to develop machine learning models to predict inter-species PPIs with a special interest in HPIs. Specifically, we focus on seeking answers to three questions that arise while developing an HPI predictor: (1) How should negative training examples be selected? (2) Does assigning sample weights to individual negative examples based on their similarity to positive examples improve generalization performance? and, (3) What should be the size of negative samples as compared to the positive samples during training and evaluation? We compare two available methods for negative sampling: random versus DeNovo sampling and our experiments show that DeNovo sampling offers better accuracy. However, our experiments also show that generalization performance can be improved further by using a soft DeNovo approach that assigns sample weights to negative examples inversely proportional to their similarity to known positive examples during training. Based on our findings, we have also developed an HPI predictor called HOPITOR (Host-Pathogen Interaction Predictor) that can predict interactions between human and viral proteins. The HOPITOR web server can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor .


Subjects
Computational Biology/methods , Host-Pathogen Interactions/physiology , Protein Interaction Mapping/methods , Software , Viral Proteins/metabolism , Area Under Curve , Computer Simulation , Databases, Protein , Internet , Machine Learning , Random Allocation , STAT1 Transcription Factor/metabolism , STAT2 Transcription Factor/metabolism
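
The idea of weighting negative examples inversely to their similarity to known positives, as described in this abstract, might look roughly like the sketch below. The similarity measure (Jaccard overlap of 3-mers), the smoothing constant `eps`, the normalization, and all function names are assumptions chosen for illustration; the paper's actual similarity measure and weighting formula are not given in the abstract.

```python
def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def soft_negative_weights(negatives, positives, k=3, eps=0.1):
    """Weight each negative inversely to its closest positive.

    Hypothetical scheme: weight = 1 / (eps + max similarity), rescaled
    so the largest weight is 1.0. Negatives that resemble a known
    positive therefore contribute less during training.
    """
    pos_sets = [kmer_set(p, k) for p in positives]
    raw = []
    for seq in negatives:
        s = kmer_set(seq, k)
        max_sim = max((jaccard(s, p) for p in pos_sets), default=0.0)
        raw.append(1.0 / (eps + max_sim))
    if not raw:
        return []
    top = max(raw)
    return [w / top for w in raw]


positives = ["MKTAYIAKQR"]
negatives = ["MKTAYIAKQR", "GGGGGGGGGG"]
weights = soft_negative_weights(negatives, positives)
```

Weights of this kind could then be passed to any learner that accepts per-sample weights, in place of discarding similar negatives outright.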
12.
J Med Syst ; 42(1): 7, 2017 Nov 21.
Article in English | MEDLINE | ID: mdl-29164340

ABSTRACT

Nuclei detection in histology images is an essential part of computer aided diagnosis of cancers and tumors. It is a challenging task due to diverse and complicated structures of cells. In this work, we present an automated technique for detection of cellular nuclei in hematoxylin and eosin stained histopathology images. Our proposed approach is based on kernelized correlation filters. Correlation filters have been widely used in object detection and tracking applications but their strength has not been explored in the medical imaging domain up till now. Our experimental results show that the proposed scheme gives state of the art accuracy and can learn complex nuclear morphologies. Like deep learning approaches, the proposed filters do not require engineering of image features as they can operate directly on histopathology images without significant preprocessing. However, unlike deep learning methods, the large-margin correlation filters developed in this work are interpretable, computationally efficient and do not require specialized or expensive computing hardware. AVAILABILITY: A cloud based webserver of the proposed method and its python implementation can be accessed at the following URL: http://faculty.pieas.edu.pk/fayyaz/software.html#corehist .


Subjects
Cell Nucleus/pathology , Image Interpretation, Computer-Assisted/methods , Machine Learning , Fourier Analysis , Humans
13.
Proteins ; 85(9): 1724-1740, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28598584

ABSTRACT

Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels.


Subjects
Calmodulin-Binding Proteins/chemistry , Calmodulin/chemistry , Proteome/genetics , Software , Algorithms , Amino Acid Sequence , Binding Sites , Calmodulin-Binding Proteins/genetics , Computer Simulation , Protein Binding , Proteome/chemistry
14.
Comput Biol Med ; 79: 68-79, 2016 Dec 01.
Article in English | MEDLINE | ID: mdl-27764717

ABSTRACT

Feature selection and ranking is of great importance in the analysis of biomedical data. In addition to reducing the number of features used in classification or other machine learning tasks, it allows us to extract meaningful biological and medical information from a machine learning model. Most existing approaches in this domain do not directly model the fact that the relative importance of features can be different in different regions of the feature space. In this work, we present a context aware feature ranking algorithm called CAFÉ-Map. CAFÉ-Map is a locally linear feature ranking framework that allows recognition of important features in any given region of the feature space or for any individual example. This allows for simultaneous classification and feature ranking in an interpretable manner. We have benchmarked CAFÉ-Map on a number of toy and real world biomedical data sets. Our comparative study with a number of published methods shows that CAFÉ-Map achieves better accuracies on these data sets. The top ranking features obtained through CAFÉ-Map in a gene profiling study correlate very well with the importance of different genes reported in the literature. Furthermore, CAFÉ-Map provides a more in-depth analysis of feature ranking at the level of individual examples. AVAILABILITY: CAFÉ-Map Python code is available at: http://faculty.pieas.edu.pk/fayyaz/software.html#cafemap . The CAFÉ-Map package supports parallelization and sparse data and provides example scripts for classification. This code can be used to reconstruct the results given in this paper.


Subjects
Algorithms , Computational Biology/methods , Data Mining/methods , Software , Cluster Analysis , Gene Expression Profiling , Internet , Machine Learning , Support Vector Machine
15.
J Bioinform Comput Biol ; 14(3): 1650011, 2016 Jun.
Article in English | MEDLINE | ID: mdl-26932275

ABSTRACT

The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein-protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine learning based methods for predicting host-pathogen interactions (HPI) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI prediction for proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between its performance and the performance of an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics such as areas under the precision-recall or receiver operating characteristic curves are not intuitive to biologists and propose simpler and more directly interpretable metrics for this purpose.


Subjects
Host-Pathogen Interactions , Protein Interaction Mapping/methods , Adenoviridae/pathogenicity , Area Under Curve , Databases, Protein , Evolution, Molecular , Human Immunodeficiency Virus Proteins/metabolism , Humans , Machine Learning , Viral Proteins/metabolism
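
The difference between K-fold and leave one pathogen protein out (LOPO) cross-validation comes down to how examples are grouped into folds. A minimal sketch of LOPO splitting follows; the function name `lopo_splits` and the toy (host, pathogen) identifiers are hypothetical and only illustrate the grouping idea from the abstract.

```python
from collections import defaultdict


def lopo_splits(pairs):
    """Yield (held_out_pathogen, train_indices, test_indices).

    Each fold holds out every example involving one pathogen protein,
    so the model is always tested on a pathogen protein for which no
    interaction partners were seen during training.
    """
    by_pathogen = defaultdict(list)
    for index, (_host, pathogen) in enumerate(pairs):
        by_pathogen[pathogen].append(index)
    for pathogen, test_idx in by_pathogen.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield pathogen, train_idx, test_idx


# Toy (host protein, pathogen protein) examples.
pairs = [("H1", "P1"), ("H2", "P1"), ("H1", "P2"), ("H3", "P3")]
splits = list(lopo_splits(pairs))
```

In plain K-fold splitting, the two `P1` examples could land in different folds, letting the model see `P1` during training; the grouping above rules that out.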
16.
Proteins ; 82(7): 1142-55, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24243399

ABSTRACT

We present a novel partner-specific protein-protein interaction site prediction method called PAIRpred. Unlike most existing machine learning binding site prediction methods, PAIRpred uses information from both proteins in a protein complex to predict pairs of interacting residues from the two proteins. PAIRpred captures sequence and structure information about residue pairs through pairwise kernels that are used for training a support vector machine classifier. As a result, PAIRpred presents a more detailed model of protein binding, and offers state of the art accuracy in predicting binding sites at the protein level as well as inter-protein residue contacts at the complex level. We demonstrate PAIRpred's performance on Docking Benchmark 4.0 and recent CAPRI targets. We present a detailed performance analysis outlining the contribution of different sequence and structure features, together with a comparison to a variety of existing interface prediction techniques. We have also studied the impact of binding-associated conformational change on prediction accuracy and found PAIRpred to be more robust to such structural changes than existing schemes. As an illustration of the potential applications of PAIRpred, we provide a case study in which PAIRpred is used to analyze the nature and specificity of the interface in the interaction of human ISG15 protein with NS1 protein from influenza A virus. Python code for PAIRpred is available at http://combi.cs.colostate.edu/supplements/pairpred/.


Subjects
Binding Sites , Protein Binding , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein/methods , Software , Computational Biology , Humans , Models, Molecular , Protein Conformation , Support Vector Machine
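
A pairwise kernel compares two residue pairs by combining a base kernel evaluated on the individual residues. As a sketch only, the block below shows one widely used form, the tensor product pairwise kernel; the abstract does not say which pairwise kernels PAIRpred uses, so this choice, the RBF base kernel, and the toy two-dimensional feature vectors are all assumptions.

```python
import math


def rbf(u, v, gamma=0.5):
    """Gaussian (RBF) base kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def tensor_product_pairwise_kernel(pair1, pair2, base=rbf):
    """Symmetric pairwise kernel built from a base kernel.

    K((a, b), (c, d)) = k(a, c) * k(b, d) + k(a, d) * k(b, c)
    Two pairs score highly when their members can be matched up
    in either order.
    """
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


# Toy residue feature vectors (hypothetical values).
res_a = (0.0, 1.0)
res_b = (1.0, 0.0)
self_similarity = tensor_product_pairwise_kernel((res_a, res_b), (res_a, res_b))
```

A matrix of such kernel values over all training pairs could be supplied to a support vector machine that accepts precomputed kernels.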
17.
Bioinformatics ; 28(18): i416-i422, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22962461

ABSTRACT

MOTIVATION: Calmodulin (CaM) is a ubiquitously conserved protein that acts as a calcium sensor, and interacts with a large number of proteins. Detection of CaM binding proteins and their interaction sites experimentally requires a significant effort, so accurate methods for their prediction are important. RESULTS: We present a novel algorithm (MI-1 SVM) for binding site prediction and evaluate its performance on a set of CaM-binding proteins extracted from the Calmodulin Target Database. Our approach directly models the problem of binding site prediction as a large-margin classification problem, and is able to take into account uncertainty in binding site location. We show that the proposed algorithm performs better than the standard SVM formulation, and illustrate its ability to recover known CaM binding motifs. A highly accurate cascaded classification approach using the proposed binding site prediction method to predict CaM binding proteins in Arabidopsis thaliana is also presented. AVAILABILITY: Matlab code for training MI-1 SVM and the cascaded classification approach is available on request. CONTACT: fayyazafsar@gmail.com or asa@cs.colostate.edu.


Subjects
Calmodulin-Binding Proteins/chemistry , Calmodulin/metabolism , Support Vector Machine , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/metabolism , Binding Sites , Calmodulin/chemistry , Calmodulin-Binding Proteins/metabolism , Protein Interaction Domains and Motifs
18.
J Med Syst ; 36(5): 3163-72, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22072280

ABSTRACT

This paper presents a novel approach for detection of Fatty liver disease (FLD) and Heterogeneous liver using textural analysis of liver ultrasound images. The proposed system is able to automatically assign a representative region of interest (ROI) in a liver ultrasound which is subsequently used for diagnosis. This ROI is analyzed using Wavelet Packet Transform (WPT) and a number of statistical features are obtained. A multi-class linear support vector machine (SVM) is then used for classification. The proposed system gives an overall accuracy of ~95% which clearly illustrates the efficacy of the system.


Subjects
Fatty Liver/classification , Fatty Liver/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Fatty Liver/diagnosis , Humans , Liver/diagnostic imaging , Sensitivity and Specificity , Support Vector Machine , Ultrasonography , Wavelet Analysis
19.
Physiol Meas ; 29(5): 555-70, 2008 May.
Article in English | MEDLINE | ID: mdl-18427158

ABSTRACT

This paper presents a robust technique for the classification of six types of heartbeats through an electrocardiogram (ECG). Features extracted from the QRS complex of the ECG using a wavelet transform along with the instantaneous RR-interval are used for beat classification. The wavelet transform utilized for feature extraction in this paper can also be employed for QRS delineation, leading to reduction in overall system complexity as no separate feature extraction stage would be required in the practical implementation of the system. Only 11 features are used for beat classification with the classification accuracy of approximately 99.5% through a KNN classifier. Another main advantage of this method is its robustness to noise, which is illustrated in this paper through experimental results. Furthermore, principal component analysis (PCA) has been used for feature reduction, which reduces the number of features from 11 to 6 while retaining the high beat classification accuracy. Due to reduction in computational complexity (using six features, the time required is approximately 4 ms per beat), a simple classifier and noise robustness (at 10 dB signal-to-noise ratio, accuracy is 95%), this method offers substantial advantages over previous techniques for implementation in a practical ECG analyzer.


Subjects
Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/physiopathology , Diagnosis, Computer-Assisted/methods , Electrocardiography/methods , Heart Rate , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Algorithms , Artificial Intelligence , Humans , Reproducibility of Results , Sensitivity and Specificity
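
The final classification step in this abstract is a k-nearest-neighbour (KNN) vote over a small feature vector per beat. A minimal sketch of that step is given below; the two-dimensional toy features, the beat labels "N" and "V", and the value of `k` are placeholders, not the 11 wavelet and RR-interval features or the settings used in the paper.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query point.
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _dist, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy beats: two made-up features per beat, two made-up classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["N", "N", "N", "V", "V", "V"]
first = knn_predict(train_X, train_y, (0.05, 0.05))
second = knn_predict(train_X, train_y, (1.0, 0.95))
```

Because prediction cost grows with the number of features per distance computation, reducing the feature count (as the abstract does with PCA) directly shortens the per-beat classification time.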